Adaptive Semiparametric Language Models
Authors
Abstract
We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture. Our model uses extended short-term context by caching local hidden states (similar to Transformer-XL) and global long-term memory by retrieving a set of nearest neighbor tokens at each timestep. We design a gating function to adaptively combine multiple information sources to make a prediction. This mechanism allows the model to use either local context, short-term memory, or long-term memory (or any combination of them) on an ad hoc basis depending on the context. Experiments on word-based and character-based language modeling datasets demonstrate the efficacy of our proposed method compared to strong baselines.
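The gating idea in the abstract can be illustrated with a minimal sketch. It assumes a sigmoid gate computed from the local hidden state and an element-wise interpolation between that state and a single memory summary vector. The names `gate`, `combine`, `w_g`, `h`, and `m` are hypothetical and chosen for illustration only; the exact formulation used by the authors may differ.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate(w_g, h):
    """Scalar gate computed from the hidden state h (assumed form)."""
    return sigmoid(sum(w * x for w, x in zip(w_g, h)))


def combine(h, m, w_g):
    """Interpolate between hidden state h and memory vector m.

    A gate near 1 favors the local hidden state; a gate near 0
    favors the retrieved memory (illustrative convention).
    """
    g = gate(w_g, h)
    return [g * hi + (1.0 - g) * mi for hi, mi in zip(h, m)]


# Toy vectors, not taken from any dataset.
h = [1.0, -2.0, 0.5]   # local hidden state
m = [0.0, 4.0, 1.5]    # memory summary vector
w_zero = [0.0, 0.0, 0.0]

# With zero gate weights the gate is exactly 0.5, so the result
# is the element-wise average of h and m.
z = combine(h, m, w_zero)
```

Because the gate depends on the hidden state, the mix between context and memory can change at every timestep, which is the "ad hoc" behavior the abstract describes.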
Related resources
Adaptive Methods for Spatial Scan Analysis via Semiparametric Mixture Models
Spatial scan density (SSD) estimation via mixture models is an important problem in the field of spatial statistical analysis and has wide applications in image analysis. The “borrowed strength” density estimation (BSDE) method via mixture models enables one to estimate the local probability density function in a random field wherein potential similarities between the density functions for the s...
Adaptive Bayesian Regression Splines in Semiparametric Generalized Linear Models
This paper presents a fully Bayesian approach to regression splines with automatic knot selection in generalized semiparametric models for fundamentally non-Gaussian responses. In a basis function representation of the regression spline, we use a B-spline basis. The reversible jump Markov chain Monte Carlo method allows for simultaneous estimation both of the number of knots and the knot placement...
An Adaptive Estimation Method for Semiparametric Models and Dimension Reduction
Xia, Tong, Li and Zhu (2002) proposed a general estimation method termed minimum average variance estimation (MAVE) for semiparametric models. The method has been found very useful in estimating complicated semiparametric models (Xia, Zhang and Tong, 2004; Xia and Härdle, 2006) and general dimension reduction (Xia, 2008; Wang and Xia, 2008). The method is also convenient to combine with other m...
Ridge Stochastic Restricted Estimators in Semiparametric Linear Measurement Error Models
In this article we consider stochastic restricted ridge estimation in semiparametric linear models when the covariates are measured with additive errors. The development of the penalized corrected likelihood method in such a model is the basis for the derivation of ridge estimates. The asymptotic normality of the resulting estimates is established. Also, necessary and sufficient condition...
Generalized Ridge Regression Estimator in Semiparametric Regression Models
In the context of ridge regression, estimation of the ridge (shrinkage) parameter plays an important role in analyzing data. Much effort has been devoted to developing methods for computing shrinkage estimators for different fully parametric ridge regression approaches, using eigenvalues. However, estimation of the shrinkage parameter has been neglected for semiparametric regression models. The m...
Journal
Journal title: Transactions of the Association for Computational Linguistics
Year: 2021
ISSN: 2307-387X
DOI: https://doi.org/10.1162/tacl_a_00371